knitr::opts_chunk$set(echo = TRUE)
library(knitr)
library(tidyverse)
library(dplyr)
library(sf)
library(sp)
library(tidycensus)
library(RSocrata)
library(spdep)
library(spatstat)
library(RColorBrewer)
library(viridis)
library(classInt)
library(reshape2)
library(FNN)
library(tm)
library(geojsonsf)
library(stats)
library(patchwork)
library(ggthemes)
library(gridExtra)
library(tmap)
library(stats)
library(patchwork)
library(ggcorrplot)
library(cowplot)
library(GWmodel)
library(gtable)
library(grid)
library(spgwr)

options(scipen=999)

Introduction

Context

The “Opioid epidemic” is a global health crisis (Krausz et al., 2021). The effectiveness of this drug class in pain treatment, modern anaesthesia and palliative care (Rosenblum et al., 2008, Rosner et al., 2019) is widely regarded in the medical world. Unfortunately, the substance is highly addictive (Waldhoer et al. 2004). Among the many side effects of opioid usage, such as digestive disorders, constipation and depression (Adams, 1889), the most notable is respiratory depression, which is the main cause of opioid overdose related deaths (Montandon and Slutsky, 2019). In the United States, there has been widespread over-prescription of opioids such as OxyContin since the 1990s (Lyden and Binswanger, 2019). This has led to increased dependency on synthetic opioids. Opioid overdose-related hospitalisations increased 64% between 2005 and 2014, and in 2016 over 42,000 Americans died from opioid overdoses (Lyden and Binswanger, 2019). Governments on the local and national levels of the country are actively pursuing solutions to mitigate the risks that this crisis poses to society.

There has been a great deal of research undertaken to understand the socio-economic and environmental factors exacerbating opioid overdoses and related deaths. Socio-economic marginalisation is an important contributor to opioid overdoses amongst the people using them (Draanen et al., 2020). Several studies have found opioid abuse to be more common amongst people recently released from prison (Brinkley-Rubinstein et al., 2018), those living in poverty (Marshall et al., 2019) and the unemployed (Hollingsworth et al., 2017), whereas higher measures of educational achievement (Marshall et al. 2019) and social support in the form of healthy relationships (Burns et al., 2004) have been found to be negatively associated with overdose cases. All of these factors point to the increasing need for policy to uplifting users of opioids with social and economic insecurity (Draanen et al., 2020) and using these associated risk factors to predict potential areas for future overdose prevalence.

While many of the mentioned studies focused on linear regression models and multivariate analysis to get to these conclusions, they often failed to take into consideration the spatial relationship that opioid overdoses have with these variables. Studies that have accounted for spatial variability have been able to highlight high levels of clustering facilitating the identification of areas which require immediate policy attention. For example, a study in Cincinnati, Ohio, not only found hotspots of heroin-related incidents via spatial clustering, but what variables were common amongst them (Choi et al., 2022). These hotspots were often found in areas with lower education rates, higher poverty rates and higher male percentages, consolidating some of the key findings in existing literature. A previous study in Cincinnati implemented spatial regression methods to factor in characteristics of the built environment as well as socio-economic variables, something that may have been more difficult to accomplish without the implementation of location data and spatial analysis (Li et al. 2019). While the proportion of parks, manufacturing districts and commercial sectors was positively associated with heroin-related emergency calls, there was a negative association seen with proximity to fast food restaurants, hospitals and opioid-treatment programs. Overall, advancements in spatial analytics have significantly progressed research in this field, with the aim to direct change in areas most needed and identify potential characteristics of areas that are most susceptible.

Adding to the previous work in this field, we aim to spatially characterise the risk factors associated with overdoses in the City of Mesa, located in the state of Arizona. Mesa is an ideal study site as the Arizona Department of Health Services has stated that every day over 5 people die from opioid overdoses within the state (Opioid Prevention, 2022). The governor of Arizona himself, Doug Ducey, has stated clearly how Arizona, and the nation as a whole is facing a public health emergency which requires targeted solutions to alleviate pain (Office of Governor Ducey, 2017). The City of Mesa publishes publicly available data on overdose incidents, providing an opportunity to determine the key characteristics of overdose hotspots which will inform future policymaking.

Research Objectives and Hypotheses

The objective of our research is to improve the targeting of interventions against heroin overdoses in Mesa. We aim to identify how demographic, socioeconomic, and geographic factors are associated with overdose deaths in order to inform the City about where to take proactive measures to prevent more incidences. To achieve this, we model the relationship between various potential risk factors and the number of overdoses using spatial regression.

We hypothesize that there will be higher overdose mortality in areas with greater minority and low-income populations. Our predictions are based on studies showing that heroin-related incidents were negatively correlated with median household income in Cincinnati (Li et al., 2019) and that incidents were growing most rapidly amongst Black and Latinx communities across US metropolitan areas (Lippold et al., 2019). We also predict that characteristics of the built environment such as distance from greenspace, zoning assignments, or indications of plight based on the aforementioned study (Li et al., 2019). Finally, we expect a lower number of incidents with regions of lower crime rates in accordance to results in similar work (Choi et al., 2022).

Data

maricopa_crs <- 'EPSG:2223'
census_api_key("0f5f47c1f6197eb4d9a922410ed9dfb8af738b50", overwrite = TRUE) 

#Boundary/outline of mesa

mesa_boundary <- st_as_sf(st_read("../City_Boundary.csv"), wkt = 'Geometry', crs = 4326, agr = 'constant')%>%st_transform(maricopa_crs)

#------------------------------------  Census Data ----------------------------------------------#

#Census Block group data: geometries + sociodemographic data

variables_of_interest2 <- c("pop" = "B01003_001",
                            "white_pop" = "B03002_003",
                            "black_pop" = "B03002_004",
                            "hispanic_pop" = "B03002_012",
                            "median_income" = "B19013_001",
                            "unemployed" = "B23025_005",
                            "total_labor_force" = "B23025_003",
                            "owner_occupied_homes" = "B25003_002",
                            "renter_occupied_homes" = "B25003_003",
                            "vacant_homes" = "B25002_003",
                            "total_homes" = "B25002_001"
)

mesa_cb_groups <- get_acs(geography = "block group", variables = variables_of_interest2, 
                          year=2020, survey = "acs5", state=04, county=013, geometry=TRUE) %>% st_transform(maricopa_crs)

out_of_bounds <- c('040139413004', '040139413002', '040130101021', '040138169001', '040139413003', '1040133184002')

mesa_cb_groups<- mesa_cb_groups[ ! mesa_cb_groups$GEOID %in% out_of_bounds, ]
mesa_cb_groups <- mesa_cb_groups[mesa_boundary,]

mesa_cb_table <- mesa_cb_groups %>%
  dplyr::select( -NAME, -moe) %>% spread(variable, estimate) %>%
  mutate(area = st_area(geometry),
         perc_white = white_pop/pop*100,
         perc_black = black_pop/pop*100,
         perc_hispanic = hispanic_pop/pop*100,
         perc_unemployed = unemployed/total_labor_force*100,
         perc_owners = owner_occupied_homes/total_homes*100,
         perc_renters = renter_occupied_homes/total_homes*100,
         perc_vacant = vacant_homes/total_homes*100)

mesa_cb_table$perc_vacant[mesa_cb_table$perc_vacant >100] <- 100

mesa_cb_table <- na.omit(mesa_cb_table)

#----------------------------------  Overdose Dataset  ------------------------------------------#


overdoses <- st_as_sf(na.omit(read.socrata("https://data.mesaaz.gov/resource/qufy-tzv6.json")), 
                      coords = c("longitude", "latitude"), crs = 4326, agr = "constant") %>% 
  st_transform(st_crs(maricopa_crs))

overdoses18_20 <- overdoses[overdoses$year == 2018 | overdoses$year == 2019 | overdoses$year == 2020, ]

overdose_intersect <- overdoses18_20 %>% st_join(., mesa_cb_table, join=st_within) %>%
  group_by(GEOID) %>% summarize(count = n()) %>% st_drop_geometry() 
colnames(overdose_intersect)[2] <- "overdose_count"

mesa_cb_table <- left_join(mesa_cb_table, overdose_intersect, by="GEOID")
mesa_cb_table$overdose_count[is.na(mesa_cb_table$overdose_count)] <- 0

#-------------------------------------  Zoning Dataset --------------------------------------------#

zoning_parcels <- st_read("../Zoning Districts.geojson")%>%st_transform(st_crs(maricopa_crs))

zones <- c("resid_high", "resid_low", "CBD", "commercial")

zoning_parcels$zoning[str_detect(zoning_parcels$zoning, "^RM")] <- "resid_high"
zoning_parcels$zoning[str_detect(zoning_parcels$zoning, "^RS")] <- "resid_low"
zoning_parcels$zoning[str_detect(zoning_parcels$zoning, "^D")] <- "CBD"
commercial_list <- c("GC","LC","NC","OC")
zoning_parcels$zoning[zoning_parcels$zoning %in%commercial_list] <- "commercial"
zoning_parcels <- zoning_parcels[zoning_parcels$zoning %in% zones, ]
zoning_parcels <- zoning_parcels[c("zoning","geometry")]

mesa_cb_sub <- subset(mesa_cb_table, select = c(GEOID, geometry, area))
zoning_intersection <- st_intersection(mesa_cb_sub, st_make_valid(zoning_parcels))
zoning_intersection$inter_area <- st_area(zoning_intersection$geometry)

zoning_intersection <- aggregate(zoning_intersection$inter_area, 
                                 by=list(zoning_intersection$GEOID,zoning_intersection$zoning), FUN=sum)

zoning_intersection<-dcast(zoning_intersection, Group.1~Group.2)

zoning_intersection[is.na(zoning_intersection)] <- 0
colnames(zoning_intersection)[1] <- "GEOID"

mesa_cb_table <- left_join(mesa_cb_table, zoning_intersection,by="GEOID")
mesa_cb_table[is.na(mesa_cb_table)] <- 0

mesa_cb_table <- mesa_cb_table %>% mutate(
  cbd_per = as.numeric(CBD/area*100),
  resid_high_per = as.numeric(resid_high/area*100),
  resid_low_per = as.numeric(resid_low/area*100),
  commercial_per = as.numeric(commercial/area*100))

columns_to_keep <- c("GEOID", "pop", "area", "overdose_count", "median_income", "perc_white", "perc_black", "perc_hispanic", 
                     "perc_unemployed", "perc_owners", "perc_renters", "perc_vacant",
                     "cbd_per", "resid_high_per", "resid_low_per", "commercial_per", "geometry")

mesa_cb_updated <- mesa_cb_table[columns_to_keep]


#------------------------------  City Property + Graffiti Dataset ------------------------------------------#

city_prop<- st_as_sf(na.omit(read.socrata("https://data.mesaaz.gov/resource/xms2-ya86.json")),
                     coords = c("longitude", "latitude"), crs = 4326, agr = "constant")%>%st_transform(st_crs(maricopa_crs))

city_prop <- city_prop %>% distinct(address, .keep_all = TRUE)

# public safety + greenspace + vacant property + community dev + graffiti

safety <- city_prop%>%filter(property_use %in% c("Public Safety--Fire/Police", "Park/Public Safety"))%>%
  dplyr::select(property_use, geometry)
safety$property_use <- "public_safety"

green<- city_prop%>%filter(property_use %in% c("Parks", "Pocket Park", "Cemetery"))%>%
  dplyr::select(property_use, geometry)
green$property_use <- "public_reenspace"

vacant_city<- city_prop%>%filter(property_use %in% c("Vacant", "Vacant (ADOT remnant)"))%>%
  dplyr::select(property_use, geometry)
vacant_city$property_use <- "vacant_property"

community <- city_prop%>%filter(
  property_use %in% c("Libraries", "Mesa Arts Center","Museums","Child Crisis Center", "Sequoia Charter School", "Senior Center"))%>%
  dplyr::select(property_use, geometry)
community$property_use <- "comm_development"


graffiti_calls <- st_as_sf(na.omit(read.socrata("https://data.mesaaz.gov/resource/9spb-749m.json")),
                           coords = c("lon", "lat"), crs = 4326, agr = "constant")%>%st_transform(st_crs(maricopa_crs))

graffiti_calls$nearest_address <- removeNumbers(graffiti_calls$nearest_address)
graffiti_calls$date_reported <- as.Date(graffiti_calls$date_reported)
graffiti_calls$property_use <- "graffiti"

graffiti_calls <- graffiti_calls %>% distinct(date_reported, nearest_address, .keep_all = TRUE) %>% 
  filter(between(date_reported, as.Date('2018-01-01'), as.Date('2021-01-01'))) %>%
  dplyr::select(property_use, geometry)


mesa_cb_updated <- 
  rbind(safety, green, vacant_city, community, graffiti_calls) %>%
  st_join(., mesa_cb_updated, join=st_within) %>%
  st_drop_geometry() %>%
  group_by(GEOID,property_use) %>%
  summarize(count = n()) %>%
  full_join(mesa_cb_updated) %>%
  spread(property_use, count, fill=0) %>%
  st_sf() %>%
  dplyr::select(-`<NA>`) %>%
  na.omit() %>%
  ungroup()


### nearest neighbor function from Public Policy Analytics Textbook

nn_function <- function(measureFrom,measureTo,k) {
  measureFrom_Matrix <-
    as.matrix(measureFrom)
  measureTo_Matrix <-
    as.matrix(measureTo)
  nn <-   
    get.knnx(measureTo, measureFrom, k)$nn.dist
  output <-
    as.data.frame(nn) %>%
    rownames_to_column(var = "thisPoint") %>%
    gather(points, point_distance, V1:ncol(.)) %>%
    arrange(as.numeric(thisPoint)) %>%
    group_by(thisPoint) %>%
    summarize(pointDistance = mean(point_distance)) %>%
    arrange(as.numeric(thisPoint)) %>% 
    dplyr::select(-thisPoint) %>%
    pull()
  return(output) 
}

### finding distance (nearest neighbors) to point data

st_c <- st_coordinates
st_coid <- st_centroid

mesa_cb_updated <-
  mesa_cb_updated %>%
  mutate(
    dist_public_safety =
      nn_function(st_c(st_coid(mesa_cb_updated)), st_c(safety),3),
    dist_greenspace =
      nn_function(st_c(st_coid(mesa_cb_updated)), st_c(green),3),
    dist_vacant_prop =
      nn_function(st_c(st_coid(mesa_cb_updated)), st_c(vacant_city),3),
    dist_comm_dev =
      nn_function(st_c(st_coid(mesa_cb_updated)), st_c(community),3),
    dist_graffiti =
      nn_function(st_c(st_coid(mesa_cb_updated)), st_c(graffiti_calls),3))


#----------------------------------  Police Incidents Dataset ---------------------------------------#


incidents <- st_as_sf(na.omit(read.socrata("https://data.mesaaz.gov/resource/39rt-2rfj.json")),
                      coords = c("longitude", "latitude"), crs = 4326, agr = "constant")%>%
  st_transform(st_crs(maricopa_crs))%>%
  filter(report_year == '2018' | report_year == '2019' | report_year == '2020')
incidents <- incidents[mesa_boundary,]

notapp <- c("DEATH", "DUI", "FOUND", "COLLISION", "INSURANCE", "LICENSE", "LOST", "PARKING", "MEDICAL", 
            "SUSPENSION", "DOG", "ANIMAL", "SPEED", "IGNITION", "REGISTRATION", "FAIL", "MENTALLY")
incidents <- incidents[ ! grepl(paste(notapp, collapse = "|"), incidents$crime_type), ]
incidents <- incidents[c("crime_id","geometry")]

incidents_intersect <- incidents %>% st_join(., mesa_cb_updated, join=st_within) %>%
  group_by(GEOID) %>% summarize(count = n()) %>% st_drop_geometry() 
colnames(incidents_intersect)[2] <- "arrests_count"

mesa_cb_final <- left_join(mesa_cb_updated, incidents_intersect, by="GEOID")
mesa_cb_final$arrests_count[is.na(mesa_cb_final$arrests_count)] <- 0

#st_write(mesa_cb_final, "../mesa_complete_data4.geojson", append=FALSE)

United States Census Bureau Data

We obtained all our demographic and socioeconomic data from the American Community Survey (ACS) which is an annual data collection program run by the United States Census Bureau. ACS is the premier source for detailed demographic and housing information in the United States. We obtained the most recent 5-year estimates (2020) on the census block group level which is the smallest geographical unit for which the Census Bureau provides data. There are 345 census block groups in Mesa with a substantial population. The information we collected includes the total population, the white, Black, and Hispanic population, the median household income, the total unemployed and the total labor force, the number of homeowners, the number of renters, the number of vacant properties, and the total number of properties. For all our main variables excluding income, we calculated the statistic as a proportion to account for varying population sizes across the census block groups.

City of Mesa Open Data

The majority of our data sets come from the The City of Mesa Data Portal maintained by the municipal government. Our primary dataset is the “Fire and Medical Opioid Overdose Incidents”, which contains all confirmed cases of opioid overdoses and their locations rounded to the 1/3 mile increments. The overdose was either confirmed by the patient or witness, an opioid found on scene, or a positive response to Narcan treatment (City of Mesa). We obtained the number of overdose incidents from 2018 to 2020 and aggregated them by census block group.

We also collected the “Zoning Districts” data from the portal. The dataset contains delineated geographic areas in the city and their land use assignments. We specifically extracted land parcels assigned as high-density residential, low-density residential, and commercial. Additionally, we utilized the “City Owned Property” dataset which lists the properties owned or sold by the city and their locations. From this dataset we obtained point data on the location of parks and cemeteries (which we labelled as greenspace), libraries, arts centers, museums, crisis centers, and senior centers (which we labelled as community development resources), fire department and police stations (which we labelled as public safety), and municipal vacant property.

To investigate the degree of plight in neighborhoods, we used the “Transportation Graffiti” dataset which includes graffiti reports in the city. We obtained reports from 2018 to 2020 and filtered them by the date and location in order to collect only distinct reports.

To establish a smoother relationship to factors of our point datasets across space, we calculated average nearest neighbor features. Adapting code from Ken Steif’s Public Policy Analytics we calculated the distance from the centroid of each census block group to the 3 nearest factor points (Steif, 2021).

Finally, to investigate crime rates, we utilized the “Police Incidents” dataset which includes incidents based on police reports and arrests and their locations rounded to the nearest hundred block. We excluded all non-crime-related incidents such as car collisions, medical emergencies, or animal-related issues. We also excluded crimes that were related to driving such as DUIs, or missing driver’s license, or speeding, because of the uncertainty about if the location of the incident is associated with the individual charged.

List of Potential Risk Factors

After feature engineering, our list of potential independent variables include population, White population (as percent), Hispanic population (as percent), Black Population (as percent), median household income, unemployed population (as percent), homeowner population (as percent), renter population (as percent), vacant homes (as percent), land zoned as high-density residential (as percent), land zoned as low-density residential (as percent), land zoned as commercial (as percent), number of crime-related police incidents, average distance to the 3 nearest: graffiti reports, community development centers, greenspace, public safety, and vacant property.

Data Exploration

maricopa_crs <- 'EPSG:2223'
mesa_dataset<- st_read("../mesa_complete_data4.geojson")%>%st_transform(st_crs(maricopa_crs))

all_cols <- c("GEOID", "overdose_count", "pop", "arrests_count", "median_income",
               "perc_white", "perc_black", "perc_hispanic", 
               "perc_unemployed", "perc_owners", "perc_renters", "perc_vacant",
               "resid_high_per", "resid_low_per", "commercial_per",
               "dist_public_safety", "dist_greenspace", "dist_vacant_prop", "dist_comm_dev", "dist_graffiti",
              "geometry")

ind_vars <- c("pop", "arrests_count", "median_income",
               "perc_white", "perc_black", "perc_hispanic", 
               "perc_unemployed", "perc_owners", "perc_renters", "perc_vacant",
               "resid_high_per", "resid_low_per", "commercial_per",
               "dist_public_safety", "dist_greenspace", "dist_vacant_prop", "dist_comm_dev", "dist_graffiti")

Overdoses

Overdose Distribution

The chart below shows the frequency distribution of opioid overdose for Mesa census blocks. We find that the distribution of overdoses is heavily right-skewed. The mean and median of overdose cases are 5.1 and 3 respectively. However, a few census blocks have a relatively large number of overdose cases compared to others. It may represent that opioid overdose in Mesa may be more prevalent in several areas compared to most places where there are only a few cases. As a result, our modelling of the overdose counts will have to account for the prevalence of census block groups with no cases.

#histogram of overdoses
ov_hist <- ggplot(data=mesa_dataset, aes(overdose_count)) + 
  geom_histogram()+ ggtitle("Frequency Distribution of Overdoses") + 
  labs(y = "Count of Census Block Groups", x = "Number of overdoses")+
  theme_minimal()

ov_hist

The following map reveals the spatial distribution of opioid overdose among Mesa census blocks from 2018 to 2020. By observation, the census blocks with a more significant number of overdose cases tend to cluster. For example, they tend to cluster in the west of Mesa and the centre part of Mesa. On the other hand, those with a relatively lower number of cases also show a general pattern of clustering towards the North region and the South part of Mesa. Therefore, it is worth studying if those with higher values tend to cluster statistically, and so do those with lower values. Global and Local Moran’s I will be performed to test this hypothesis statistically.

mesa_cb_chor <- mutate(mesa_dataset, overdose_cat = cut(overdose_count, breaks = c(-.000001, 1, 3, 6, 10, 14, 45))) 

pal <- brewer.pal(6, "OrRd")
plot(mesa_cb_chor[, "overdose_count"],
     breaks = c(-.000001, 1, 3, 6, 10, 14, 45),
     pal = pal,
     main = "Overdoses in Mesa (2018-2020) by Census Block Group")

Global Autocorrelation

Global Moran’s I is performed to check if autocorrelation of opioid overdoses exists globally. We find that at teh census block group level, counts of overdose incidents display significant positive autocorrelation.

mesa_nb <- poly2nb(mesa_dataset, queen=TRUE)

mesa_weights <- nb2listw(mesa_nb, style="W", zero.policy=TRUE)


#global autocorrelation results
moran.test(mesa_dataset$overdose_count, mesa_weights)

Local Autocorrelation

Next, we investigate local autocorrelation using Local Moran’s I. The Moran scatterplot below shows the overdose count of each census block is compared to the neighbour’s weighted counts. The scatterplot reveals that majority of autocorrelation is from high overdose values neighboring high values (top right). However, there are also several instances of low values neighboring low values (bottom left).

# Moran Scatterplot

moran.plot(mesa_dataset$overdose_count, mesa_weights,
           xlab="Overdose Counts", ylab="Spatailly lagged Overdose Counts")

We map the various clusters to examine the spatial distribution of the autocorrelation. The result shows that in most of the city the autocorrelation is not significant. However, there is significant high-high autocorrelation in the west and center of the city. There is one case of low-low autocorrelation in the northeast of the city.

# Morans Clusters
moranCluster <- function(shape, W, var, alpha=0.05)
{
  # Code adapted from https://rpubs.com/Hailstone/346625
  Ii <- localmoran(shape[[var]], W)
  shape$Ii <- Ii[,"Ii"]
  shape$Iip <- Ii[,"Pr(z != E(Ii))"]
  shape$sig <- shape$Iip<alpha
  # Scale the data to obtain low and high values
  shape$scaled <- scale(shape[[var]]) # high low values at location i
  shape$lag_scaled <- lag.listw(mesa_weights, shape$scaled) # high low values at neighbours j
  shape$lag_cat <- factor(ifelse(shape$scaled>0 & shape$lag_scaled>0, "HH",
                                 ifelse(shape$scaled>0 & shape$lag_scaled<0, "HL",
                                        ifelse(shape$scaled<0 & shape$lag_scaled<0, "LL",
                                               ifelse(shape$scaled<0 & shape$lag_scaled<0, "LH", "Equivalent")))))
  shape$sig_cluster <- as.character(shape$lag_cat)
  shape$sig_cluster[!shape$sig] <- "Non-sig"
  shape$sig_cluster <- as.factor(shape$sig_cluster)
  results <- data.frame(Ii=shape$Ii, pvalue=shape$Iip, type=shape$lag_cat, sig=shape$sig_cluster)
  
  return(list(results=results))
}

clusters <- moranCluster(mesa_dataset, W=mesa_weights, var="overdose_count")$results
mesa_dataset$Ii_cluster <- clusters$sig

tm_shape(mesa_dataset) + tm_polygons(col="Ii_cluster", palette = "Set3", title = "Clusters") + tm_layout(title= 'Spatial Autocorrelation', fontface = 2)

Examining Independent Variables

Spatial Distributions

We explored the spatial distribution of our potentnial independent variables by mapping them as shown below.

Our results show that there are several trends associated with the western part of the city. In this part of the city, we find the most extreme number of arrests, the largest proportions of Hispanic residents and lowest proportions of White residents, as well as the lowest proportions of homeowners. In general, we find that most of the city is zoned for low-density residential housing while most of the commercial and high-density residential zoning is in the northwest or across the central part of the city closer to the downtown. Most of the census block groups with large amounts of areas zoned as vacant land are in the eastern part of Mesa. On the other hand, most of the city-owned vacant properties (no inhabitants) are closer to the west of the city. We find that the average distance to greenspace, public safety facilities and graffiti are generally similar across the cities except in the northeast and southeast of the city which are also the areas of the highest incomes. Distance to community development resources increasingly enlarges west of the city. However, this is expected given that the downtown is near the east. Finally, we find an extreme outlier in one census block group in the center of the city with close to a 100% unemployment rate which we will remove from our data set.

ind_vars_table <- mesa_dataset[ind_vars]

vars_net.long <-
  gather(ind_vars_table, Variable, value, -geometry)

mapTheme<- function(base_size = 15, title_size = 20) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = title_size, colour = "black"),
    plot.caption=element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),axis.title = element_blank(),
    axis.text = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    strip.text.x = element_text(size = 10),
    legend.key = element_rect(fill=NA))
}

vars <- unique(vars_net.long$Variable)
mapList <- list()

#map risk factors
for(i in vars){
  mapList[[i]] <- 
    ggplot() +
    geom_sf(data = filter(vars_net.long, Variable == i), aes(fill=value), colour=NA) +
    scale_fill_viridis(option = "A", name="") +
    labs(title=i) +
    mapTheme() +
    theme(plot.title = element_text(size=10))}
do.call(grid.arrange,c(mapList, ncol = 5, top = "Spatial Distribution of Independent Variables"))

Frequency Distributions

We also explored the frequency distributions of the potential independent variables. Most of the distributions are heavily right-skewed or left skewed. Several of variables have several observations in the zero bucket because many census block groups do not have areas that zoned for activities other than low-density housing.

mesa_nogeo <- mesa_dataset
st_geometry(mesa_nogeo) <- NULL

mesa_nogeo <- mesa_nogeo[ind_vars]

data_long <- mesa_nogeo %>% pivot_longer(colnames(mesa_nogeo))%>%as.data.frame()

ggplot(data_long, aes(x = value)) + geom_histogram() + theme(axis.text.x = element_text(angle = 45)) + theme_void() + facet_wrap(~ name, scales = "free")

Correlations Between Potential Risk Factors

We examine the correlations between the independent variables to check for possible multicollinearity before running our regressions. As expected, some variables are highly correlated such as the proportion of Hispanic residents and White residents or the proportion of homeowners and renters. We find that the proportion of White residents and the proportion of renters are highly negatively correlated with each other but also correlated with several other potential risk factors, so we decide to drop them from our list of independent variables. Several of our distance-related risk factors are also highly positively correlated like the distance to public safety, greenspace and community development resources. This situation is likely because these are all municipal facilities so they are all located closer to the city center.

cormat <- round(cor(mesa_nogeo),2)
ggcorrplot(cormat)

Correlations Between with Overdose Counts and Potential Risk Factors

Finally, we explore the relationship between the potential risk factors and the number of overdoses in each census block group. We find that there are some variables with very low linear correlations to the number of overdoses such as the population, the percent of land zoned as low-density residential, the pecent of land zoned as vacant, and the unemployment rate. On the other hand, variables such as the number of arrests and the percent of homeowners are much more correlated with overdoses. The scatterplots show that several of relationships do not appear to follow a linear pattern, but this phenomenon might be influenced by the strong spatial relationships we previously observed.

vars_plus <- append(ind_vars,"overdose_count")

mesa_corr <- mesa_dataset
st_geometry(mesa_corr) <- NULL
mesa_corr <- mesa_corr[vars_plus]

mesa_corr <- mesa_corr %>% gather(Variable, Value, -overdose_count) %>%
  mutate(Value = as.numeric(Value))

correlation.cor <-
  mesa_corr %>%
    group_by(Variable) %>%
    summarize(correlation = cor(Value, overdose_count, use = "complete.obs"))
    
ggplot(mesa_corr, aes(Value, overdose_count)) +
  geom_point(size = 1) +
  geom_text(data = correlation.cor, aes(label = paste("r =", round(correlation, 2))),
            x=-Inf, y=Inf, vjust = 1.5, hjust = -.1) +
  geom_smooth(method = "lm", se = FALSE, colour = "black") + theme_void() +
  facet_wrap(~Variable, ncol = 6, scales = "free") +
  labs(title = "Correlations between Number of Overdoses and Risk Variables \n
        ") + 
  theme(strip.text.x = element_text(size = 10))

Modelling

After taking into account multicollinearity and the relationship with the overdose count, the variables chosen for the model are: median household income, Black population (as percent), Hispanic population (as percent), number of police incidents, homeowner population (as percent), land zoned as commercial (as percent), land zoned as high-density residential (as percent), average distance to vacant property, average distance to graffiti reports, average distance to greenspace and average distance to public safety.

maricopa_crs <- 'EPSG:2223'
mesa_dataset<- st_read("../mesa_complete_data4.geojson")%>%st_transform(st_crs(maricopa_crs))
all_vars <- c("overdose_count", "arrests_count", "median_income", "perc_black", "perc_hispanic","perc_owners",
              "resid_high_per","commercial_per", "dist_greenspace", "dist_vacant_prop", "dist_graffiti",
              "dist_public_safety")

ind_vars <- c("arrests_count", "median_income", "perc_black", "perc_hispanic", "perc_owners",
              "resid_high_per","commercial_per", "dist_greenspace", "dist_vacant_prop", "dist_graffiti",
              "dist_public_safety")

mesa_all <- mesa_dataset[all_vars]

OLS and Poisson Baseline Models

By Ayina Anyachebelu

Methodology

We first build our baseline models using Ordinary Least Squares and Poisson Regression. The formula for the linear regression is as follows:

\[ y_i = \beta_0 + \beta_1x_{i1} + \beta_2x_{i2} +... + \beta_nx_{in} + \epsilon_i \] where \(y_i\) is the \(i\)th observation of number of overdoses, \(x_i\) corresponds with each of our n potential risk factors, \(\beta_0\) represents the intercept term, \(\beta_n\) is the nth estimated parameter, and \(\epsilon\) is the error term.

Considering our dependent variable consists of non-negative overdose counts, we also use a Poisson regression with the following equation:

\[ ln(y_i) = \beta_1x_{i1} + \beta_2x_{i2} +... + \beta_nx_{in} + \epsilon_i \] where \(x_1\) = 1 so \(\beta_1\) represents our constant term and for any values of \(x_1\)\(x_n\) we can use the regression to predict the odds of an overdose happening.

Results

Below are the results of the OLS Model.

reg <- lm(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic + perc_owners +
          resid_high_per + commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti + 
            dist_public_safety, 
          data = mesa_all)
summary(reg)
## 
## Call:
## lm(formula = overdose_count ~ arrests_count + median_income + 
##     perc_black + perc_hispanic + perc_owners + resid_high_per + 
##     commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti + 
##     dist_public_safety, data = mesa_all)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.1507  -2.8612  -0.8112   1.5282  30.0266 
## 
## Coefficients:
##                        Estimate   Std. Error t value   Pr(>|t|)    
## (Intercept)         4.252008260  1.853815332   2.294   0.022433 *  
## arrests_count       0.006032666  0.001254947   4.807 0.00000232 ***
## median_income       0.000005764  0.000014174   0.407   0.684506    
## perc_black          0.070938516  0.071251288   0.996   0.320162    
## perc_hispanic       0.053501444  0.018188173   2.942   0.003494 ** 
## perc_owners        -0.003865112  0.018783864  -0.206   0.837098    
## resid_high_per      0.058222957  0.016440101   3.542   0.000455 ***
## commercial_per      0.015697167  0.039800303   0.394   0.693540    
## dist_greenspace     0.000060832  0.000141520   0.430   0.667582    
## dist_vacant_prop   -0.000052372  0.000044145  -1.186   0.236319    
## dist_graffiti       0.000315769  0.000280584   1.125   0.261229    
## dist_public_safety -0.000381148  0.000140613  -2.711   0.007064 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.585 on 333 degrees of freedom
## Multiple R-squared:  0.3365, Adjusted R-squared:  0.3146 
## F-statistic: 15.35 on 11 and 333 DF,  p-value: < 0.00000000000000022

The OLS model has an Adjusted R-squared of 0.31, suggesting our factors only explain about 30% of the variation in number of overdoses. At the 5% level, independent variables with statistically significant coefficients were the number of arrests, the percent of Hispanic residents, the percent of the area zoned as high-density residential and the distance to public safety resources. All variables except the distance to public safety resources are positively correlated with number of overdoses.

Below are the results of the Poisson regression. Using this model, in addition to the significant variables in the OLS regression, the percentage of Hispanic residents and percentage of Black residents and the distance to public greenspace were also significant at the 5% level. Particularly interesting results are that every 1-unit increase in the percentage of land zoned as high-density residential corresponds to a ~ 0.98% increase in overdose cases. Similarly every 1-unit increase in the percentage of black residents and hispanic residents corresponded to overdose case increases of 1.17% and 0.70% respectively.

Unlike our hypotheses the farther away from public greenspace and public safety resources, the lower the overdose counts. While there are no grounds for causation, the reason for this phenomenon might be the fact that many of these public facilities are nearer to the city center and farther from suburban outskirts, so more overdoses may be happening in these areas.

preg <- glm(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic + perc_owners +
          resid_high_per + commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti + 
            dist_public_safety, family = "poisson", data = mesa_all)
summary(preg)
## 
## Call:
## glm(formula = overdose_count ~ arrests_count + median_income + 
##     perc_black + perc_hispanic + perc_owners + resid_high_per + 
##     commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti + 
##     dist_public_safety, family = "poisson", data = mesa_all)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -4.9643  -1.5895  -0.4465   0.7398   6.8914  
## 
## Coefficients:
##                         Estimate    Std. Error z value             Pr(>|z|)    
## (Intercept)         2.0595832100  0.1566417221  13.148 < 0.0000000000000002 ***
## arrests_count       0.0006844624  0.0000686423   9.971 < 0.0000000000000002 ***
## median_income      -0.0000008436  0.0000013771  -0.613               0.5401    
## perc_black          0.0116940923  0.0048565931   2.408               0.0160 *  
## perc_hispanic       0.0070092185  0.0013777334   5.087          0.000000363 ***
## perc_owners         0.0009652980  0.0014383300   0.671               0.5021    
## resid_high_per      0.0098350225  0.0011898858   8.266 < 0.0000000000000002 ***
## commercial_per      0.0045587548  0.0026782416   1.702               0.0887 .  
## dist_greenspace    -0.0000579105  0.0000143220  -4.043          0.000052664 ***
## dist_vacant_prop    0.0000025045  0.0000043933   0.570               0.5686    
## dist_graffiti      -0.0000081344  0.0000421016  -0.193               0.8468    
## dist_public_safety -0.0001149590  0.0000129847  -8.853 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 2322.2  on 344  degrees of freedom
## Residual deviance: 1269.0  on 333  degrees of freedom
## AIC: 2206.1
## 
## Number of Fisher Scoring iterations: 5

The Moran’s correlation test was conducted to check for residual spatial dependence of both models. As seen below, the test resulted in a Moran’s I of 0.014 and 0.0068 respectively with corresponding Moran I statistics that were not significant on the 5% level. As a result, we cannot retain the null hypothesis that there is no spatial clustering of the residuals.

lm.morantest(reg, mesa_weights, zero.policy=TRUE)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = overdose_count ~ arrests_count + median_income +
## perc_black + perc_hispanic + perc_owners + resid_high_per +
## commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti +
## dist_public_safety, data = mesa_all)
## weights: mesa_weights
## 
## Moran I statistic standard deviate = 0.98346, p-value = 0.1627
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.0143625012    -0.0155922393     0.0009277194
lm.morantest(preg, mesa_weights, zero.policy=TRUE)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: glm(formula = overdose_count ~ arrests_count + median_income +
## perc_black + perc_hispanic + perc_owners + resid_high_per +
## commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti +
## dist_public_safety, family = "poisson", data = mesa_all)
## weights: mesa_weights
## 
## Moran I statistic standard deviate = 0.41218, p-value = 0.3401
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.006842371     -0.006272096      0.001012336

Although we cannot reject the null hypothesis based of our Moran correlation test results, an examination of the residuals of the OLS model (below), shows that their distribution appear to be non-random with large positive residuals in the west and center of the city. The poisson model has residual values of smaller magnitude which are more evenly spread out spatially. However, there still seems to be some clustering of positive residuals in the west and center of the city. For this reason, we proceed with spatial regression models.

mesa_all$res_m1 <- residuals(reg)

legend_title = expression("OLS model residuals") 
map_mesa_res = tm_shape(mesa_all) +
tm_fill(col = "res_m1", title = legend_title, palette = magma(256), style = "cont") + tm_layout(bg.color = "white")

mesa_all$res_m2 <- residuals(preg)
legend_title = expression("Poisson model residuals") 
map_mesa_res2 = tm_shape(mesa_all) +
tm_fill(col = "res_m2", title = legend_title, palette = magma(256), style = "cont")+ tm_layout(bg.color = "white")

tmap_arrange(map_mesa_res, map_mesa_res2)

Spatial Lagged Model

***By ___***

Methodology

Results

Spatial Error Model

By ____

Methodology

Results

Spatial Durbin Watson Model

By ____

Methodology

Results

Geographically Weighted Regression

By _____

Methodology

Results

Geographically Weighted Poisson Regression

By Ayina Anyachebelu

Methodology

Given that a Poisson regression better represents the relationship between the risk factors and the overdose count, we attempt to integrate this model and the spatial variation in overdose cases using a geographically weighted poisson regression which captures local relationships.

The geographically weighted poisson regression follows the form:

\[ ln(y_i) = \beta_0u_{i} + \beta_1(u_i)x_{i1} + \beta_2(u_i)x_{i2} +... + \beta_n(u_i)x_{in} + \epsilon \] where where \(y_i\) is the \(i\)th observation of number of overdoses and the \(\beta\) coefficients are functions of the location \(u_i\) designating the coordinates of the \(i\)th census block group, allowing the estimated \(\beta\) parameter to differ between census block groups (Poliart et. al, 2021).

For each census block group, the parameters were estimated by

\[ \overset{\wedge}{\beta}_i = (X^TW(u_{xi},u_{yi})X)^{-1}X^TW(u_{xi},u_{yi})Y \] where \(W(u_{xi},u_{yi})\) denotes a spatial weight matrix containing weights calculated based on the distance between the centroid of the \(i\)th census block group with coordinates (u_{xi},u_{yi}) and its spatial neighbors j (Fortheringham et al, 2003).

The spatial weight \(w_{ij}\) allows for census block groups closer to the \(i\)th census block group to have a larger influence on the parameter estimation for \(\beta_i\). The weight is based on a kernel function. We experiment with two kernel functions the Gaussian and Exponential, which are both continuous functions of the distance between observation points.

The corresponding weight matrices are calculated as follows

Gaussian: \(w_{i j}=\exp \left(-\frac{1}{2}\left(\frac{d_{i j}}{b}\right)^2\right)\) and Exponential: \(w_{i j}=\exp \left(-\frac{\left|d_{i j}\right|}{b}\right)\)

where \(h\) is the kernel bandwidth and \(d_{ij}\) is the distance between points i and j (Gollini et al., 2013).

For each kernel function, we also experiment with both a fixed kernel (which use a fixed bandwidth) and adaptive kernel (which uses a varying bandwidth based on the number of nearest neighbors from a given regression point.) Given we have census block groups of different sizes, the adaptive kernels could be useful because these kernels change in size based on the density of data. Using the fixed kernels could result in calibration on few observation points for local regression in larger units and vice versa resulting in larger standard errors (Gollini et al., 2013).

maricopa_crs <- 'EPSG:2223'
mesa_dataset<- st_read("../mesa_complete_data4.geojson")%>%st_transform(st_crs(maricopa_crs))

mesa_dataset<- st_read("../mesa_complete_data4.geojson")%>%st_transform(st_crs(maricopa_crs))

all_vars <- c("overdose_count", "arrests_count", "median_income", "perc_black", 
              "perc_hispanic","perc_owners","resid_high_per","commercial_per", "dist_greenspace", 
              "dist_vacant_prop", "dist_graffiti","dist_public_safety")
mesa_all <- mesa_dataset[all_vars]

st_c <- st_coordinates
st_coid <- st_centroid
mesa.c <- st_c(st_coid(mesa_all))

DM<-gw.dist(dp.locat=coordinates(mesa.c))
mesa_all$long <- mesa.c[,1]
mesa_all$lat <- mesa.c[,2]

We select the optimal bandwidth for each kernel by minimizing the Akaike Information Criterion to account for both prediction accuracy as well as lower complexity of our model (Gollini et al., 2015). For the fixed Gaussian and Exponential kernels the optimal bandwidths were 7614.3182847 and 6843.6505642 meters respectively. For the adaptive Gaussian and Exponential kernels, the optimal bandwidths was 18 nearest neighbors.

# FIT THE 4 GWPRs

#Gaussian non-adaptive
bgwr.gnPr <- ggwr.basic(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic +
                        perc_owners + resid_high_per + commercial_per + dist_greenspace +
                        dist_vacant_prop + dist_graffiti + dist_public_safety, 
                      data = mesa_all,
                      family = "poisson",
                      bw = bw.gwrG,
                      kernel = "gaussian", 
                      adaptive = FALSE,
                      dMat = DM)


#gaussian adaptive

bgwr.gPr <- ggwr.basic(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic +
                        perc_owners + resid_high_per + commercial_per + dist_greenspace +
                        dist_vacant_prop + dist_graffiti + dist_public_safety, 
                      data = mesa_all,
                      family = "poisson",
                      bw = bw.gwrGA, 
                      kernel = "gaussian", 
                      adaptive = TRUE,
                      dMat = DM)

# exponential non-adaptive

bgwr.nPr <- ggwr.basic(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic +
                        perc_owners + resid_high_per + commercial_per + dist_greenspace +
                        dist_vacant_prop + dist_graffiti + dist_public_safety, 
                      data = mesa_all,
                      family = "poisson",
                      bw = bw.gwrE, 
                      kernel = "exponential", 
                      adaptive = TRUE,
                      dMat = DM)


#exponential adaptive

bgwr.Pr <- ggwr.basic(overdose_count ~ arrests_count + median_income + perc_black + perc_hispanic +
                        perc_owners + resid_high_per + commercial_per + dist_greenspace +
                        dist_vacant_prop + dist_graffiti + dist_public_safety, 
                      data = mesa_all,
                      family = "poisson",
                      bw = bw.gwrEA, 
                      kernel = "exponential", 
                      adaptive = TRUE,
                      dMat = DM)

Results

We fit the 4 geographically weighted Poisson models using their corresponding optimal bandwidths. We find that the model using the exponential adaptive kernel has the lowest AIC and residual sum of squares and the highest Pseudo R-squared values. Below are the maps of the standard deviation of the local residual values for the models. As mentioned, exponential adaptive kernel appears to have the best fit for our data, so we choose these parameters for our final model.

gwr_out <- as.data.frame(bgwr.gnPr$SDF)
mesa_all$gauss_na_localRes<-(gwr_out$residual)/sd(gwr_out$residual)

gwr_out <- as.data.frame(bgwr.gPr$SDF)
mesa_all$gauss_localRes<-(gwr_out$residual)/sd(gwr_out$residual)

gwr_out <- as.data.frame(bgwr.nPr$SDF)
mesa_all$exp_na_localRes<-(gwr_out$residual)/sd(gwr_out$residual)

gwr_out <- as.data.frame(bgwr.Pr$SDF)
mesa_all$exp_localRes<-(gwr_out$residual)/sd(gwr_out$residual)

my.palette <- brewer.pal(n = 8, name = "RdYlBu")

gauss_na_localRes <-spplot(mesa_all,"gauss_na_localRes", main = "Gauss. Non-Adaptive", 
                     col="transparent", col.regions = my.palette, cuts = 7)

gauss_localRes<-spplot(mesa_all,"gauss_localRes", main = "Gauss. Adaptive", 
                         col="transparent",col.regions = my.palette, cuts = 7)

exp_na_localRes<-spplot(mesa_all,"gauss_localRes", main = "Expon. Non-Adaptive", 
                          col="transparent", col.regions = my.palette, cuts = 7)

exp_localRes<-spplot(mesa_all,"exp_localRes", main = "Expon. Adaptive", 
                             col="transparent", col.regions = my.palette, cuts = 7)

grid.arrange(gauss_na_localRes, gauss_localRes, exp_na_localRes, exp_localRes,
             ncol= 3, heights = c(20,20), top = textGrob("Stdev of Residuals", gp=gpar(fontsize=20)))

Below are the results of the geographically weighted poisson regression using an adaptive exponential kernel. The results show that the model has a pseudo R-squared value of 0.69, meaning that it is able to explain almost 70% of the variation in overdose cases. This is a substantial improvement from the 45% R-squared value of the baseline poisson model.

The results of the GWPR provide us with better insight into how the risk factors vary across the different census block groups. For example, the baseline poisson regression indicated that 1 additional arrest corresponded with a 0.068% increase in overdoses. However, with the GWPR we see that the increase can be over 3 times higher, corresponding to a 0.23% increase in overdoses. The broadness of the ranges in local estimates are even more pronounced for the Hispanic population as well as land zoned as high-density residential. The initial estimate from the baseline model indicated that one percentage point increase in the proportion of Hispanic residents was associated with a 0.7% increase in number of overdoses, but the GWPR reveals that it can be associated with an overdose increase of as high as 2.8% in some city areas. Meanwhile, estimates for increases in incidents corresponding to high-density residential zoning range from as low as 0.1% to as high as ~ 2.5%. The results also reveal that a majority of our distance-related risk factors, especially the distance to public greenspace and distance to public safety are almost always negatively correlated with overdose cases, regardless of the part of the city which disaffirms our original hypothesis that in blocks closer to these facilities there will be less incidents.

bgwr.Pr
##    ***********************************************************************
##    *                       Package   GWmodel                             *
##    ***********************************************************************
##    Program starts at: 2022-12-25 21:39:51 
##    Call:
##    ggwr.basic(formula = overdose_count ~ arrests_count + median_income + 
##     perc_black + perc_hispanic + perc_owners + resid_high_per + 
##     commercial_per + dist_greenspace + dist_vacant_prop + dist_graffiti + 
##     dist_public_safety, data = mesa_all, bw = bw.gwrEA, family = "poisson", 
##     kernel = "exponential", adaptive = TRUE, dMat = DM)
## 
##    Dependent (y) variable:  overdose_count
##    Independent variables:  arrests_count median_income perc_black perc_hispanic perc_owners resid_high_per commercial_per dist_greenspace dist_vacant_prop dist_graffiti dist_public_safety
##    Number of data points: 345
##    Used family: poisson
##    ***********************************************************************
##    *              Results of Generalized linear Regression               *
##    ***********************************************************************
## 
## Call:
## NULL
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.0000  -0.6807  -0.2278   0.4445  15.8330  
## 
## Coefficients:
##                         Estimate    Std. Error z value             Pr(>|z|)    
## Intercept           2.0595832100  0.1566417221  13.148 < 0.0000000000000002 ***
## arrests_count       0.0006844624  0.0000686423   9.971 < 0.0000000000000002 ***
## median_income      -0.0000008436  0.0000013771  -0.613               0.5401    
## perc_black          0.0116940923  0.0048565931   2.408               0.0160 *  
## perc_hispanic       0.0070092185  0.0013777334   5.087          0.000000363 ***
## perc_owners         0.0009652980  0.0014383300   0.671               0.5021    
## resid_high_per      0.0098350225  0.0011898858   8.266 < 0.0000000000000002 ***
## commercial_per      0.0045587548  0.0026782416   1.702               0.0887 .  
## dist_greenspace    -0.0000579105  0.0000143220  -4.043          0.000052664 ***
## dist_vacant_prop    0.0000025045  0.0000043933   0.570               0.5686    
## dist_graffiti      -0.0000081344  0.0000421016  -0.193               0.8468    
## dist_public_safety -0.0001149590  0.0000129847  -8.853 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 2322.2  on 344  degrees of freedom
## Residual deviance: 1269.0  on 333  degrees of freedom
## AIC: 1293
## 
## Number of Fisher Scoring iterations: 5
## 
## 
##  AICc:  1293.976
##  Pseudo R-square value:  0.4535294
##    ***********************************************************************
##    *          Results of Geographically Weighted Regression              *
##    ***********************************************************************
## 
##    *********************Model calibration information*********************
##    Kernel function: exponential 
##    Adaptive bandwidth: 18 (number of nearest neighbours)
##    Regression points: the same locations as observations are used.
##    Distance metric: A distance matrix is specified for this model calibration.
## 
##    ************Summary of Generalized GWR coefficient estimates:**********
##                                 Min.        1st Qu.         Median
##    Intercept          -0.22607694166  1.20067276202  1.73960798969
##    arrests_count       0.00042833565  0.00074185490  0.00107242046
##    median_income      -0.00001113456 -0.00000422743  0.00000013829
##    perc_black         -0.02218714651 -0.00127668722  0.00569746260
##    perc_hispanic      -0.00317040486  0.00372447667  0.00640199380
##    perc_owners        -0.00384157393  0.00040092182  0.00289802946
##    resid_high_per      0.00105202619  0.00577706051  0.01019328060
##    commercial_per     -0.01377220074  0.00243908237  0.01007332235
##    dist_greenspace    -0.00023266937 -0.00014462074 -0.00004561559
##    dist_vacant_prop   -0.00004565609 -0.00002088898 -0.00000954670
##    dist_graffiti      -0.00013024107  0.00001260078  0.00004873067
##    dist_public_safety -0.00026929248 -0.00014928048 -0.00010660187
##                              3rd Qu.   Max.
##    Intercept           2.47403573774 3.6311
##    arrests_count       0.00152285380 0.0023
##    median_income       0.00000312500 0.0000
##    perc_black          0.01224007904 0.0332
##    perc_hispanic       0.01012181509 0.0283
##    perc_owners         0.00740199548 0.0182
##    resid_high_per      0.01461808174 0.0242
##    commercial_per      0.01523986914 0.0230
##    dist_greenspace    -0.00001225402 0.0001
##    dist_vacant_prop    0.00000815925 0.0001
##    dist_graffiti       0.00009227430 0.0003
##    dist_public_safety -0.00006211557 0.0000
##    ************************Diagnostic information*************************
##    Number of data points: 345 
##    GW Deviance: 719.0314 
##    AIC : 889.3877 
##    AICc : 946.1102 
##    Pseudo R-square value:  0.6903717 
## 
##    ***********************************************************************
##    Program stops at: 2022-12-25 21:39:53

Given that the magnitude of the coefficients is very small, it is useful to examine the test statistics to investigate how the statistical significance of these values change spatially. Below are maps of the pseudo t-values for each of the risk factors.

From the results, we observe that the number of arrests is almost always statistically significant in all regions, but it is especially significant in the northeast, southeast and western-central areas of the city. Additionally, the hispanic population is increasingly significant closer to the center of the city while the black population is significant in the far west of the city. Distance to greenspace has a significant negative relationship with overdose incidents in the west of the city while distance to public safety has a significnat negative relationship with overdose cases in the east of the city. However, it is important to note when interpreting these results that thes are municipal-run facilities, so for example, our model is only taking into consideration city-owned parks are not accounting for private greenspace which might be available in other areas. Unsurprisingly, land zoned being zoned commercial is only significant in the east of the city which is reasonable as that is farther from the western downtown area where most of the neighborhoods would be zoned as commercial. On the other hand, distance to vacant property is positively statistically significant in the center of the city but negatively statistically significant in the in the southeast. Similarly, median income is positively significant near the downtown area and central-east, but there are enclaves of blocks where the relationship is negative. Finally, homeownership is mainly statistically significant in the northwest of the city.

mesa_all$t_arrests_count<-bgwr.Pr$SDF$arrests_count_TV
mesa_all$t_perc_black<-bgwr.Pr$SDF$perc_black_TV
mesa_all$t_perc_hispanic<-bgwr.Pr$SDF$perc_hispanic_TV
mesa_all$t_dist_public_safety<-bgwr.Pr$SDF$dist_public_safety_TV
mesa_all$t_resid_high_per<-bgwr.Pr$SDF$resid_high_per_TV
mesa_all$t_dist_greenspace<-bgwr.Pr$SDF$dist_greenspace_TV

mesa_all$t_median_income<-bgwr.Pr$SDF$median_income_TV
mesa_all$t_perc_owners<-bgwr.Pr$SDF$perc_owners_TV
mesa_all$t_commercial_per<-bgwr.Pr$SDF$commercial_per_TV
mesa_all$t_dist_vacant_prop<-bgwr.Pr$SDF$dist_vacant_prop_TV
mesa_all$t_dist_graffiti<-bgwr.Pr$SDF$dist_graffiti_TV
mesa_all$t_Intercept<-bgwr.Pr$SDF$Intercept_TV


my.palette <- brewer.pal(n = 7, name = "YlOrRd")

t_arrest_count<-spplot(mesa_all,"t_arrests_count", main=list(label=" Arrests ", cex=1),
                         col="transparent",col.regions = my.palette, cuts = 6)

t_perc_hispanic<-spplot(mesa_all,"t_perc_hispanic", main=list(label="Hispanic %", cex=1), 
                     col="transparent",col.regions = my.palette, cuts = 6)

t_perc_black<-spplot(mesa_all,"t_perc_black", main=list(label=" Black % ", cex=1), 
                  col="transparent",col.regions = my.palette, cuts = 6)

t_resid_high_per<-spplot(mesa_all,"t_resid_high_per", main=list(label="High Dens. Residential", cex=1), 
                           col="transparent", col.regions = my.palette, cuts = 6)

t_dist_greenspace<-spplot(mesa_all,"t_dist_greenspace", main=list(label="Dist to Greenspace", cex=1), 
                            col="transparent", col.regions = my.palette, cuts = 6)

t_dist_public_safety<-spplot(mesa_all,"t_dist_public_safety", main=list(label="Dist to Public Safety", cex=1),
                               col="transparent", col.regions = my.palette, cuts = 6)

t_median_income<-spplot(mesa_all,"t_median_income", main=list(label="Median Income", cex=1), 
                               col="transparent", col.regions = my.palette, cuts = 3)

t_perc_owners<-spplot(mesa_all,"t_perc_owners", main=list(label=" Homeowner % ", cex=1),
                               col="transparent", col.regions = my.palette, cuts = 6)

t_commercial_per<-spplot(mesa_all,"t_commercial_per", main=list(label="Commercial %", cex=1),
                               col="transparent", col.regions = my.palette, cuts = 6)

t_dist_vacant_prop<-spplot(mesa_all,"t_dist_vacant_prop", main=list(label="Dist to Vacant Prop.", cex=1), 
                               col="transparent", col.regions = my.palette, cuts = 6)

t_dist_graffiti<-spplot(mesa_all,"t_dist_graffiti", main=list(label="Dist to Graffiti", cex=1),
                               col="transparent", col.regions = my.palette, cuts = 6)

#t_Intercept<-spplot(mesa_all,"t_Intercept", main = "Intercept", 
                               #col="transparent", col.regions = my.palette, cuts = 6)

grid.arrange(t_arrest_count, t_perc_hispanic, t_perc_black, t_resid_high_per, t_dist_greenspace, 
             t_dist_public_safety, t_median_income, t_perc_owners, t_commercial_per,
             t_dist_vacant_prop, t_dist_graffiti,
              ncol = 4, top = textGrob("Local t-values", gp=gpar(fontsize=20)))

Discussion of Modelling Results

Discussion on the Modelling Results here - why we pick GWPR; how the city can target the west effectively based on the results.

Conclusion

Summary

References

Adams J. F. A. (1889), “Substitutes for opium in chronic diseases”, Boston Medical and Surgical Journal, 121, 15, 351-356.

Arizona Department of Health Services. (2022) Opioid Prevention [Online]. Available at: https://www.azdhs.gov/opioid/ (Accessed: 1 December 2022)

Brinkley-Rubinstein L., A. Macmadu, B. D. L. Marshall, A. Heise, S. I. Ranapurwala, J. D. Rich, T. C. Green (2018) “Risk of fentanyl-involved overdose among those with past year incarceration: Findings from a recent outbreak in 2014 and 2015”, Drug and Alcohol Dependence, 185 (2018), 189-191.

Burns J. M., R. F. Martyres, D. Clode and J. M. Boldero (2004) “Overdose in young people using heroin: Associations with mental health, prescription drug use and personal circumstances”, Medical Journal of Australia, 128, 7, 25-28.

Choi J. I., Lee J, Yeh A. B., Lan Q and Kang H, (2022) “Spatial clustering of heroin-related overdose incidents: a case study in Cincinnati, Ohio”, BMC Public Health, 22, 1253. https://doi.org/10.1186/s12889-022-13557-3

Draanen J. V., C. Tsang, S. Mitra, M. Karamouzian, and L. Richardson (2020), “Socioeconomic marginalisation and opioid-related overdose: A systematic review”, Drug and Alcohol Dependence, 214 (2020), 108-127.

Fotheringham, A.S., Brunsdon, C. and Charlton, M., (2003), “Geographically weighted regression: the analysis of spatially varying relationships”. John Wiley & Sons.

Gollini I., Lu B., Charlton M., Brunsdon C., and Harris P. (2013), “GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models”, arXiv, 2013. https://arxiv.org/abs/1306.0413

Gollini I., Charlton M., Brunsdon C., and Harris P. (2015), “GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models”, Journal of Statistical Software, 63(17).

Guy G. P., K. Zhang, M. K. Bohm, J. Losby, B. Lewis, R. Young, L. B. Murphy and D. Dowell (2017) “Vital signs: Changes in opioid prescribing in the United States, 2006-2015”, Morbidity and Mortality Weekly Report, 66, 26, 697-704.

Hollingsworth A., C. J. Ruhm and K. Simon (2017) “Macroeconomic conditions and opioid abuse”, Journal of Health Economics, 56 (2017), 223-233.

Kosten T. R. and T. P. George (2002) “The neurobiology of opioid dependence: Implications for treatment”, Science and Practice Perspectives, 1, 1, 13-20.

Krausz R. M., J. N. Westenberg and K. Ziafat (2021) “The opioid overdose crisis as a global health challenge”, Current Opinion in Psychiatry, 34, 4, 405-412.

Lippold K. M., Jones C. M., Olsen E. O. and Giroir B. P. (2019) “Racial/Ethnic and Age Group Differences in Opioid and Synthetic Opioid-Involved Overdose Deaths Among Adults Aged ≥18 Years in Metropolitan Areas - United States, 2015-2017”, MMWR Morb Mortal Wkly Rep, 68, 4, 967-73. doi: 10.15585/mmwr.mm6843a3.

Li Z. R., E. Xie, F. W. Crawford, J. L. Warren, K. McConnell, J. T. Copple, T. Johnson and G. S. Gonsalves (2019) “Suspected heroin-related overdose incidents in Cincinnati, Ohio: A spatiotemporal analysis”, PLoS Medicine, 16, 11, 1-15.

Lyden J. and I. A. Binswanger (2019) “The United States opioid epidemic”, Seminars in Perinatology, 43 (2019), 123-131.

Marshall J. R. S. F. Gassner, C. L. Anderson, R. J. Cooper, S. Lotfipour and B. Chakravarthy (2019) “Socioeconomic and geographical disparities in prescription and illicit opioid-related overdose deaths in Orange County, California, from 2010-2014”, Substance Abuse, 40, 1, 80-86.

Montandon G. and A. S. Slutsky (2019), “Solving the opioid crisis; Respiratory depression by opioids as critical end point”, Chest, 156, 4, 653-658.

Office of Governor Ducey. (2017) Governor Ducey Declares Statewide Health Emergency In Opioid Epidemic [Online]. Available at: https://azgovernor.gov/governor/news/2017/06/governor-ducey-declares-statewide-health-emergency-opioid-epidemic (Accessed: 1 December 2022)

Poliart, A., Kirakoya-Samadoulougou, F., Ouédraogo, M. (et al. (2021), “Using geographically weighted Poisson regression to examine the association between socioeconomic factors and hysterectomy incidence in Wallonia, Belgium.” BMC Women’s Health 21, 373.

Rosenblum A., L. A. Marsch, H. Joseph and R. K. Portenoy (2008), Opioids and the treatment of chronic pain: Controversies, current status and future directions”, Experimental and Clinical Psychopharmacology

Rosner B., J. Neicun, J. C. Yang and A. Roman-Urrestarazu (2019) “Opioid prescription patterns in Germany and the global opioid epidemic: Systematic review of available evidence”, PLoS One, 14, 8, 1-20.

Steif, Ken. (2021) Public Policy Analytics: Code & Context for Data Science in Government. [online]. Available at: https://urbanspatial.github.io/PublicPolicyAnalytics/index.html (Accessed: 1 December 2022)

Waldhoer M., S. E. Bartlett and J. L. Whistler (2004) “Opioid receptors”, Annual Review of Biochemistry, 73 (2004), 953-990.